Today I’m going to show you how to make a few simple plots and automated maps in R using the ICEWS and Phoenix event datasets. For the sake of this tutorial, I created a subset of the datasets that only contains 5 countries in the Caucasus region: Russia, Georgia, Azerbaijan, Armenia, and Turkey.

Note that Phoenix actually consists of three different subsets, New York Times (NYT), BBC Summary of World Broadcasts (SWB), and CIA Foreign Broadcast Information Service (FBIS), while Integrated Crisis Early Warning (ICEWS) is one single dataset.

First, to load the data and do just a little processing with dplyr

#### Read Datasets ####

# NYT
NYT <- read.csv(file ="data/NYT.csv") %>%
  mutate(date = as.Date(date)) %>%
  mutate(Dataset = "NYT")

# SWB
SWB<- read.csv(file ="data/SWB.csv") %>%
  mutate(date = as.Date(date)) %>%
  mutate(Dataset = "SWB")

# FBIS
FBIS <- read.csv(file ="data/FBIS.csv") %>%
  mutate(date = as.Date(date)) %>%
  mutate(Dataset = "FBIS")

# ICEWS
ICEWS <- read.csv(file ="data/ICEWS.csv") %>%
  mutate(date = as.Date(date)) %>%
  mutate(Dataset = "ICEWS")

Plotting with ggplot2

ggplot2 is both beautiful and terrible. It can produce very nice graphics, but is not super user friendly. ggplot2 is based off the idea of “grammar of graphics,” which is theoretically elegant way of dividing plots into data, visible geometry, and a coordinate system, but is so deep it can be easy to get lost.

Histogram basics

Anyway, let’s start with a basic histogram showing how many events over time there are in the NYT data. For this, we need to pass ggplot the data, tell it our visual mapping (date should be on the x axis), and then tell it what kind of geometry to make (with a separate function). Also note that y is implicit with a histogram, since it generates the y axis (count).

library(ggplot2, suppressPackageStartupMessages(TRUE))

# Take a look at the fields
colnames(NYT)
##  [1] "eid"           "story_date"    "year"          "month"        
##  [5] "day"           "source"        "source_root"   "source_agent" 
##  [9] "source_others" "target"        "target_root"   "target_agent" 
## [13] "target_others" "code"          "root_code"     "quad_class"   
## [17] "goldstein"     "joined_issues" "lat"           "lon"          
## [21] "placename"     "statename"     "countryname"   "aid"          
## [25] "process"       "date"          "cameo.root"    "Dataset"
# NYT data, aes (aesthetic) sets date field as x
ggplot(NYT, aes(x = date)) + 
  
  # set binwidth to cover 100 days
  geom_histogram( binwidth = 100)

That’s cool, but is still missing some important features like a title, and the aesthetic mapping could be better. The following is the type of customization you can do within ggplot. Note that scale_x_date is a specialized ggplot function to modify the display of dates, pretty crazy. Here we are letting ggplot figure out the limits itself, which is important to realize since our datasets have different temporal coverage.

ggplot(NYT, aes(x = date)) + 
  
  # set binwidth to cover 100 days, set transparent, blue
  geom_histogram(binwidth = 100,
           fill = "dodgerblue3") +
  
  # Change the breaks for the labels to better cover the area
  scale_x_date(date_breaks = "5 years",
               date_minor_breaks = "1 year",
               date_labels = "%Y") +
  
  # Change the labels to be a little intuitive
  labs( title = "Phoenix NYT Events per 100 days", 
        x = NULL,
        y = "100 Day Count") +
  
  # Make the fonts bigger, boldface the title
  theme( title = element_text(size = 14, face = "bold"),
         axis.text = element_text(size = 12) )

Note the two dips, one in 1961, and another in 1978. Let’s go ahead and zoom in on 1961 by changing our scale_x_date chunk to have a specific limit on the time period, and then reducing our binwidth to be finer-grained. Note that R wants this as a date datatype, requiring us to call as.Date to do a quick transform. We can also manipulate the spacing of the dates to better fit the time period.

ggplot(NYT, aes(x = date)) + 
  
  # set binwidth to cover 100 days, set transparent, blue
  geom_histogram(binwidth = 10,
           fill = "dodgerblue3") +
  
  # Set the limits to a specific time period
  scale_x_date(limits = c(as.Date("1960-01-01"), as.Date("1963-01-01") ), 
               date_breaks = "1 month",
               date_minor_breaks = "1 month",
               date_labels = "%y %b") +
     
  # Change the labels to be a little intuitive
  labs( title = " 1961 Phoenix NYT Events", 
        x = NULL,
        y = "10 Day Count") +
  
  # Note I changed the angle in element_text
  theme( title = element_text(size = 14, face = "bold"),
         axis.text = element_text( angle = 90) )

Hm, well that ain’t right. That, or nothing happened for big chunks of 1961.

Frequency Polgyons

This is just mapping the number of events over time, but we also have event types stored in the cameo.root field (Neutral, Verbal Cooperation/Conflict, and Material Cooperation/Conflict). Since we’re dealing with this a lot, we can make a custom color scheme to keep them consistent across the plots. Otherwise, ggplot will assign colors.

To map these to any color, all we have to do is tell ggplot that we want color mapped to cameo.root in the initial aes() block. If we want custom colors, we can make a variable to store the color values, then assign them with scale_color_manual

# Manually set colors for event types, called in ggmap later
event.color <- c("Neutral" = "gray60", 
          "Verbal cooperation" = "steelblue1",
          "Material cooperation"  = "dodgerblue3", 
          "Verbal conflict" = "salmon1", 
          "Material conflict" = "firebrick2")

We could try to plot the bar histograms like the total example, but since the data will overlap, the bars will cover each other up. It’s better use use a line equivalent, which in this context is confusingly called a frequency polygon.

# Set as variable so we can modify titles + labels on the fly
binwidth <- 365

ggplot(NYT, aes(x = date, color = cameo.root)) + 
  
  # Use frequency polygon; expand binwidth to variable days, set transparent
  geom_freqpoly(binwidth = binwidth, alpha = .8, size = 1) +
  
  # Change the breaks to better cover the area
  scale_x_date(date_breaks = "10 years",
               date_minor_breaks = "1 year",
               date_labels = "%Y") +
  
  # Here is where we are assigning the colors we defined
  scale_color_manual(name = "Source", values = event.color) +
  
  # This resent the alpha to make the legend legible
  guides(color = guide_legend(override.aes = list(alpha = 1)))+
  
  # Change the labels to be a little intuitive
  labs( title = paste("Phoenix NYT Events per", binwidth, "days"), 
        x = NULL,
        y = paste(binwidth, "Day Count") ) +
  
  # Make the fonts bigger, boldface the title
  theme( title = element_text(size = 14, face = "bold"),
         axis.text = element_text(size = 12) )

Extending Histograms to Other Datasets

We can naturally make graphs like this for the other datasets. Note I am manually updating the color and the title for the total graph.

ggplot calls can get obnoxiously long and difficult to comment, so many people store the plot as a variable, then modify it through the variable. I don’t do this as much, but it is very common to see in official and unofficial documentation. Here I just do it for the first plot. Note that rmarkdown automagically generates the plot even if stored in a variable, but in R Studio you would not see a plot if it is stored and not called.

First, SWB:

#### Total SWB ####

# Store plot in a variable, then add to it
base <- ggplot(SWB, aes(x = date)) +
  
  # set binwidth to cover 100 days, set transparent, blue
  geom_histogram(binwidth = 100,
           fill = "darkorchid") 
  
  # Change the breaks to better cover the area
base + scale_x_date(date_breaks = "5 years",
               date_minor_breaks = "1 year",
               date_labels = "%Y") +
  
  # Change the labels to be a little intuitive
  labs( title = "Phoenix SWB Events per 100 days", 
        x = NULL,
        y = "100 Day Count") +
  
  # Make the fonts bigger, boldface the title
  theme( title = element_text(size = 14, face = "bold"),
         axis.text = element_text(size = 12) )
## Warning: Removed 111 rows containing non-finite values (stat_bin).

And now we can just call the plot object to get it to plot:

# Call the plot object to actually plot
base
## Warning: Removed 111 rows containing non-finite values (stat_bin).

Now to repeat the vent type plots we made with the NYT to a different dataset.

#### SWB by Type ####

# Start plot
ggplot(SWB, aes(x = date, color = cameo.root)) + 
  
  # Expand binwidth to variable days, set transparent
  geom_freqpoly(binwidth = binwidth, alpha = .8, size = 1) +
  
  # Change the breaks to better cover the area
  scale_x_date(date_breaks = "5 years",
               date_minor_breaks = "1 year",
               date_labels = "%Y") +
  
  # Here is where we are assigning the colors we defined
  scale_color_manual(name = "Source", values = event.color) +
  
  # This resent the alpha to make the legend legible
  guides(color = guide_legend(override.aes = list(alpha = 1)))+
  
  # Change the labels to be a little intuitive
  labs( title = paste("Phoenix SWB Events per", binwidth, "days"), 
        x = NULL,
        y = paste(binwidth, "Day Count") ) +
  
  # Make the fonts bigger, boldface the title
  theme( title = element_text(size = 14, face = "bold"),
         axis.text = element_text(size = 12) )

Now for the FBIS data- notice that there is a spurious event lurking somewhere significantly before coverage actually starts.

#### Total FBIS ####

# Start plot
ggplot(FBIS, aes(x = date)) + 
  
  # set binwidth to cover 100 days, set transparent, blue
  geom_histogram(binwidth = 100,
           fill = "cyan4") +
  
  # Change the breaks to better cover the area
  scale_x_date(date_breaks = "5 years",
               date_minor_breaks = "1 year",
               date_labels = "%Y") +
  
  # Change the labels to be a little intuitive
  labs( title = "Phoenix FBIS Events per 100 days", 
        x = NULL,
        y = "100 Day Count") +
  
  # Make the fonts bigger, boldface the title
  theme( title = element_text(size = 14, face = "bold"),
         axis.text = element_text(size = 12) )

#### FBIS by Type ####

ggplot(FBIS, aes(x = date, color = cameo.root)) + 
  
  # Expand binwidth to variable days, set transparent
  geom_freqpoly(binwidth = binwidth, alpha = .8, size = 1) +
  
  # Change the breaks to better cover the area
  scale_x_date(date_breaks = "5 years",
               date_minor_breaks = "1 year",
               date_labels = "%Y") +
  
  # Here is where we are assigning the colors we defined
  scale_color_manual(name = "Source", values = event.color) +
  
  # This resent the alpha to make the legend legible
  guides(color = guide_legend(override.aes = list(alpha = 1)))+
  
  # Change the labels to be a little intuitive
  labs( title = paste("Phoenix FBIS Events per", binwidth, "days"), 
        x = NULL,
        y = paste(binwidth, "Day Count") ) +
  
  # Make the fonts bigger, boldface the title
  theme( title = element_text(size = 14, face = "bold"),
         axis.text = element_text(size = 12) )

Finally, ICEWS. Note that the field names in ICEWS are different (CAMEO.root vs cameo.root).

#### Total ICEWS ####

# Start Plot
ggplot(ICEWS, aes(x = date)) + 
  
  # set binwidth to cover 100 days, set transparent, blue
  geom_histogram(binwidth = 100,
           fill = "firebrick") +
  
  # Change the breaks to better cover the area
  scale_x_date(date_breaks = "5 years",
               date_minor_breaks = "1 year",
               date_labels = "%Y") +
  
  # Change the labels to be a little intuitive
  labs( title = "ICEWS Events per 100 days", 
        x = NULL,
        y = "100 Day Count") +
  
  # Make the fonts bigger, boldface the title
  theme( title = element_text(size = 14, face = "bold"),
         axis.text = element_text(size = 12) )

#### ICEWS by Type ####

ggplot(ICEWS, aes(x = date, color = CAMEO.root)) + 
  
  # Expand binwidth to variable days, set transparent
  geom_freqpoly(binwidth = binwidth, alpha = .8, size = 1) +
  
  # Change the breaks to better cover the area
  scale_x_date(date_breaks = "5 years",
               date_minor_breaks = "1 year",
               date_labels = "%Y") +
  
  # Here is where we are assigning the colors we defined
  scale_color_manual(name = "Source", values = event.color) +
  
  # This resent the alpha to make the legend legible
  guides(color = guide_legend(override.aes = list(alpha = 1)))+
  
  # Change the labels to be a little intuitive
  labs( title = paste("ICEWS Events per", binwidth, "days"), 
        x = NULL,
        y = paste(binwidth, "Day Count") ) +
  
  # Make the fonts bigger, boldface the title
  theme( title = element_text(size = 14, face = "bold"),
         axis.text = element_text(size = 12) )

Plotting Multiple Datasets

These illustrate the distribution of each of the datasets over time, but we have to keep in mind the axes are changing depending on the data, so while we get a good view of the internal variation, we can’t see how the datasets compare to one another.

The preferable approach for ggplot is to format your data to death before plotting, such as aggregating all the data you want to plot into a single data frame, like how we have the event type stored within a field. However, it is possible to over-plot separate data sources onto a single ggplot, it just takes a little bit of effort. ggplot can be called with an empty data frame, and the data designated for each of the geometry functions separately.

Here I am also storing the plot separately because I want to make a version with a different date scale.

# store our color scheme for the different sources
source.color <- c("NYT" = "dodgerblue3", 
             "SWB" = "darkorchid",
             "FBIS" = "cyan4", 
             "ICEWS" = "firebrick")

# Start plot with empty data frame, since we'll add the data separately , store for later
base.plot <-  ggplot(data.frame(), aes(color = source.color)) + 
  
  # NYT
  geom_freqpoly(data = NYT, aes(x=date, color = "NYT"), 
                binwidth = 100,
                size = 1.2,
                alpha = .6, show.legend = TRUE) +
  
  # SWB
  geom_freqpoly(data = SWB, aes(x=date, color = "SWB"), 
                binwidth = 100,
                size = 1.2,
                alpha = .6) +
  
  # FBIS
  geom_freqpoly(data = FBIS, aes(x=date, color = "FBIS"), 
                binwidth = 100,
                size = 1.2,
                alpha = .6) +
  
  # ICEWS
  geom_freqpoly(data = ICEWS, aes(x=date, color = "ICEWS"), 
                binwidth = 100,
                size = 1.2,
                alpha = .6) +

  # Here is where we are assigning the colors we defined
  scale_color_manual(name = "Source", values = source.color) +
  
  # This resent the alpha to make the legend legible
  guides(color = guide_legend(override.aes = list(alpha = 1)))+
  
  # Change the labels to be a little intuitive
  labs( title = "Events Over Time", 
        x = NULL,
        y = "100 Day Count") +

  # Make the fonts bigger, boldface the title
  theme( title = element_text(size = 14, face = "bold"),
         axis.text = element_text(size = 12) ) +
     
  scale_x_date(date_breaks = "5 years",
               date_minor_breaks = "1 year",
               date_labels = "%Y") 

# Call the plot
base.plot

It would also make sense to look at the same plot, but only for the dates where there is significant overlap. Since we are especially interested in how ICEWS stacks up, we will start when it starts, 1995. We can add to or override the previously set parameters, and ggplot will warn you if a value is changed. Check out the warnings for all the data being excluded.

  ## Changed the start date, WHOOOO!!  
  base.plot + 
  scale_x_date(limits = c(as.Date("1995-01-01"), NA), 
               date_breaks = "2 year",
               date_minor_breaks = "1 year",
               date_labels = "%Y")
## Scale for 'x' is already present. Adding another scale for 'x', which
## will replace the existing scale.
## Warning: Removed 22383 rows containing non-finite values (stat_bin).
## Warning: Removed 26039 rows containing non-finite values (stat_bin).
## Warning: Removed 7 rows containing non-finite values (stat_bin).
## Warning: Removed 2 rows containing missing values (geom_path).

## Warning: Removed 2 rows containing missing values (geom_path).

## Warning: Removed 2 rows containing missing values (geom_path).

## Warning: Removed 2 rows containing missing values (geom_path).

We can also subset the data within the plot calls, which lets us make a comparison according to that subset between the datasets. In this case, I am only looking at the “Neutral” events across the different datasets.

# Start plot with empty data frame, since we'll add the data separately , store for later
base.plot <-  ggplot(data.frame(), aes(color = source.color)) + 
  
  # NYT
  geom_freqpoly(data = NYT[NYT$cameo.root == "Neutral",], aes(x=date, color = "NYT"), 
                binwidth = 100,
                size = 1.2,
                alpha = .6, show.legend = TRUE) +
  
  # SWB
  geom_freqpoly(data = SWB[SWB$cameo.root == "Neutral",], aes(x=date, color = "SWB"), 
                binwidth = 100,
                size = 1.2,
                alpha = .6) +
  
  # FBIS
  geom_freqpoly(data = FBIS[FBIS$cameo.root == "Neutral",], aes(x=date, color = "FBIS"), 
                binwidth = 100,
                size = 1.2,
                alpha = .6) +
  
  # ICEWS: Note cameo field name is different
  geom_freqpoly(data = ICEWS[ICEWS$CAMEO.root == "Neutral",], aes(x=date, color = "ICEWS"), 
                binwidth = 100,
                size = 1.2,
                alpha = .6) +

  # Here is where we are assigning the colors we defined
  scale_color_manual(name = "Source", values = source.color) +
  
  # This resent the alpha to make the legend legible
  guides(color = guide_legend(override.aes = list(alpha = 1)))+
  
  # Change the labels to be a little intuitive
  labs( title = "Neutral Events Over Time: Dataset Comparison", 
        x = NULL,
        y = "100 Day Count") +

  # Make the fonts bigger, boldface the title
  theme( title = element_text(size = 14, face = "bold"),
         axis.text = element_text(size = 12) ) +

  # Limit date range
  scale_x_date(limits = c(as.Date("1995-01-01"), NA), 
               date_breaks = "2 year",
               date_minor_breaks = "1 year",
               date_labels = "%Y") 

# Call the plot
base.plot

Mapping Event Data

Okay, so graphs are cool, but maps are neater!

ggplot has some limited ability to deal with geographic data. Here are the points from SWB plotted with an orthographic projection, using the ggplot coord_map and borders function. It’s not well suited the sort of fine-grained point data we have, however.

ggplot(SWB, aes(x = lon, y = lat, color = cameo.root )) +
  
  borders("world", color = "gray80", fill = "gray40") +
  
  # plot the points
  geom_point(alpha = .6) + 
     
  # Assign our color scheme
  scale_color_manual(name = "Event Type", values = event.color) +
  
  # This resent the alpha to make the legend legible
  guides(color = guide_legend(override.aes = list(alpha = 1)))+

  # ggplot map coordinate transform 
  coord_map(projection = "ortho", orientation = c( 45, 80, 0 ))  +
  
  # Change the labels to be a little intuitive
  labs( title = "SWB Events", 
        x = "Degrees East/West",
        y = "Degrees North/South")

Mapping the Google Way

In order to create more detailed maps, we need more detailed data. It is possible to download all sorts of geographic data, but we can also use Google Maps service to provide us with base layers, which we can then just plot on top of.

Let’s start with Georgia. First, a little setup. Within Phoenix, countries are recorded using their ISO 3-letter codes, which can be converted to the full names using a package.

# Store Georgia's country code to make things easier
country <- "GEO"

# create subset for Georgia
c.subset <- SWB %>% filter(countryname == country)

# Load countrycode package to convert country code to full name
library(countrycode, suppressPackageStartupMessages(TRUE))

# Get full name of country
long.name <- countrycode(country, "iso3c", "country.name")

Next, we need to load ggmap. We can use the points to create a bounding box which will define the extent of the map. The package warns you doing it this way is experimental, and it will break in some situations, particularly data with global coverage or geometry above 80 degrees North or South.

# Load ggmap package
library(ggmap, suppressPackageStartupMessages(TRUE))

# Get bounding box around points to pass to get_map
bbox <- make_bbox(lon, lat, c.subset, f = .5)

# Get map for bounding box from google maps service, black and white
map <- get_map(bbox, color = "bw")
## Warning: bounding box given to google - spatial extent only approximate.
## converting bounding box to center/zoom specification. (experimental)
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=42.31665,43.178055&zoom=7&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
# Create map with ggmap
ggmap(map)

ggmap is set up to behave essentially like a ggplot object, letting us apply all our ggplot variables to the map as if it were any other plot.

To better portray the different events, let’s set up more of our visual variables. We’re going to be mapping the CAMEO class of each of the points, which again tells us what kind of event it is (Neutral, Verbal Cooperation/Conflict, and Material Cooperation/Conflict).

The particular visual challenge here is over-plotting of symbols, since we can expect multiple events in one recorded location (i.e., a city center).

# Manually set shapes for event types, called in ggmap later
event.shape <- c("Neutral" = 3, 
               "Verbal cooperation" = 1,
               "Material cooperation"  = 16, 
               "Verbal conflict" = 2, 
               "Material conflict" = 17)

# Manually set sizes for event types, called in ggmap later
event.size <- c("Neutral" = 1, 
                "Verbal cooperation" = 2,
                "Material cooperation"  = 1.5, 
                "Verbal conflict" = 2, 
                "Material conflict" = 1.5)

Cool, now let’s get mapping! Note that it is really just like ggplot. We also have to add our visual variable mapping separately with specific functions.

# call ggmap,  tint white
ggmap(map, darken = c(0.6, "white")) + 
     
     # Add points, map to variables
     geom_point(data = c.subset, 
               aes(  x = lon,
                     y = lat,
                     color = cameo.root,
                     shape = cameo.root,
                     size = cameo.root),
             alpha = .8) + 
     
     # Add size mapping
     scale_size_manual(values = event.size) +
     
     # Add color mapping
     scale_colour_manual(values = event.color) +
     
     # Add shape mapping
     scale_shape_manual(values = event.shape) +
     
     # Add label
     labs( title = paste("SWB", long.name))+
     
     # This resent the alpha to make the legend legible
     guides(color = guide_legend(override.aes = list(alpha = 1)))

Automated Cartography: Looping Maps

We can also loop this pretty easily and make a map for many countries iteratively, which is useful for these global coverage datasets. While you would want to do more customization if you were aiming to publish (the automation isn’t perfect), they work great for basic visualization. Here I just use SWB, but this will work with the other datasets (ICEWS would require some renaming since the field names are different).

While we can grab the country names automatically, we have to exclude Russia since it will not be properly mapped by ggmap. Essentially, ggmap breaks with locations too far North or South, which has to do with the map projection used which has severe distortion near the poles.

Also note that this loop takes a while to run, since it has to query the Google Maps API. If you are making a series of maps, it is better to query the API for the base map, then use that base map multiple times for your different maps. You can totally get API request denials if you query too fast and too often.

# Set as a variable, so it's easy to swap out
title.dataset <- "SWB"

# Get all the countries in the data 
countries <- as.character(unique(SWB$countryname))

# Russia's extent breaks automated mapping, sorry Russia :(
countries <- countries[countries != "RUS"]

# Start our loop
for (i in 1:length(countries)){
    
     # Assign by index, sets up everythign else
     country <- countries[i]
     
     # get full name using countrycode package
     long.name <- countrycode(country, "iso3c", "country.name")
     
     # subset dataset by country name
     c.subset <- SWB %>% filter(countryname == country)
     
     # Get bounding box, fudge factor .4
     bbox <- make_bbox(lon, lat, c.subset, f = .4)
     
     # Get map for bounding box
     map <- get_map(bbox, color = "bw")
     
     # Set subtitle
     sub <- "Events by Type"
    
    # Plots events by type
    type.map.print <- ggmap(map, darken = c(0.7, "white")) + 
         
         # Add points, map to variables
         geom_point(data = c.subset, 
                          aes(  x = lon,
                                y = lat, 
                                color = cameo.root, 
                                shape = cameo.root,
                                size = cameo.root), 
                          alpha = .8) + 
          # Add size mapping
          scale_size_manual(values = event.size) +
          
          # Add color mapping
          scale_colour_manual(values = event.color) +
          
          # Add shape mapping
          scale_shape_manual(values = event.shape) +
          
          # Add labels by combining the long.name with dataset title
          labs( title = paste(title.dataset, long.name),
              subtitle = sub) 
          
     # Print is needed to show ggplots created in loops
     print(type.map.print)

}
## converting bounding box to center/zoom specification. (experimental)
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=38.968215,35.33298&zoom=6&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## converting bounding box to center/zoom specification. (experimental)
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.129225,47.86404&zoom=7&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false

## converting bounding box to center/zoom specification. (experimental)
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=40.07325,45.1096&zoom=8&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false

## converting bounding box to center/zoom specification. (experimental)
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=42.31665,43.178055&zoom=7&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false

Mapping Multiple Datasets

As with the ggplot histogram comparing the different datasets over time, we can do essentially the same thing with ggmap. This time however, let’s make a map of the entire region.

We can do this by first making a map that fits our desired area and then mapping our datasets on top. ggmap will only plot points within the map extent, so we don’t have to worry about manual clipping. To find our extent, we can run get_map by giving it coordinates for the center of our area and a zoom factor.

# Get our base map from Google
aoi.map <- get_map( location = c(lon = 44, lat = 43), zoom = 6, color = "bw")
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=43,44&zoom=6&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false

Let’s also set up a event shapes, since we’ll have the same problem with over-plotting here. Note that you get several warning messages because of the points outside our map area not being plotted.

# Manually set shapes for event types, called in ggmap later
event.shape <- c("NYT" = 1, 
                  "SWB" = 5,
                  "FBIS" = 0, 
                  "ICEWS" = 18)

# Plot map, with white tint
ggmap(aoi.map, darken = c(0.6, "white")) +
     
     # Add NYT points with right color
     geom_point(data = NYT[NYT$cameo.root == "Neutral",], 
                aes(x = lon,
                    y = lat,
                    shape = "NYT",
                    color = "NYT")) +
     
     # Add SWB points with right color
     geom_point(data = SWB[SWB$cameo.root == "Neutral",], 
                aes(x = lon,
                    y = lat,
                    shape = "SWB",
                    color = "SWB")) +
     
     # FBIS points with right color
     geom_point(data = FBIS[FBIS$cameo.root == "Neutral",],
                aes(x = lon,
                    y = lat,
                    shape = "FBIS",
                    color = "FBIS")) +
     
     # ICEWS: Note cameo field abd location names are different
     geom_point(data = ICEWS[ICEWS$CAMEO.root == "Neutral",], 
                aes(x = Longitude,
                    y = Latitude,
                    shape = "ICEWS",
                    color = "ICEWS")) +
        
     # Assign source colors, defined earlier
     scale_color_manual(name = "Source", values = source.color ) +
     
     # Add shape mapping
     scale_shape_manual( name = "Source", values = event.shape ) +
     
     # Add label
     labs( title = "Caucasus Neutral Events by Source" )
## Warning: Removed 5821 rows containing missing values (geom_point).
## Warning: Removed 49869 rows containing missing values (geom_point).
## Warning: Removed 14210 rows containing missing values (geom_point).
## Warning: Removed 2192 rows containing missing values (geom_point).

Conclusion

ggplot is great for data visualization, and ggmap is a great extension that lets you easily create maps using Google Maps base data.